Goto

Collaborating Authors

 additive error


AGeneralized Binary Tree Mechanism for Private Approximation of All-Pair Shortest Distances

Neural Information Processing Systems

We study the problem of approximating all-pair distances in a weighted undirected graph with differential privacy, introduced by Sealfon [Sea16]. Given a publicly known undirected graph, we treat the weights of edges as sensitive information, and two graphs are neighbors if their edge weights differ in one edge by at most one. We obtain efficient algorithms with significantly improved bounds on a broad class of graphs which we refer to as recursively separable. In particular, for any n-vertex Kh-minor-free graph, our algorithm achieve an additive error of eO(h(nW)1/3), where W represents the maximum edge weight; For grid graphs, the same algorithmic scheme achieve additive error of eO(n1/4 W). Our approach can be seen as a generalization of the celebrated binary tree mechanism for range queries, as releasing range queries is equivalent to computing all-pair distances on a path graph. In essence, our approach is based on generalizing the binary tree mechanism to graphs that are recursively separable. JL and ZZ have been supported by National Science Foundation of China under Grant No. 62472212 and the New Cornerstone Science Foundation. Supported in part by NSF award 2228995 JU's research was funded by the NSFCNS 2433628, Google Seed Fund grant, Google Research Scholar Award, Dean Research Seed Fund, and Rutgers Decanal Grant no.


Differentially Private Gomory-Hu Trees

Neural Information Processing Systems

Given an undirected, weighted n-vertex graph G = (V,E,w), a Gomory-Hu tree T is a weighted tree on V that preserves the Min-s-t-Cut between any pair of vertices s,t V. Finding cuts in graphs is a key primitive in problems such as bipartite matching, spectral and correlation clustering, and community detection. We design a differentially private (DP) algorithm that computes an approximate Gomory-Hu tree. Our algorithm is ε-DP, runs in polynomial time, and can be used to compute s-tcuts that are O(n/ε)-additive approximations of the Min-s-t-Cuts in Gfor all distinct s,t V with high probability. Our error bound is essentially optimal, since [29] showed that privately outputting a single Min-s-t-Cut requires Ω(n) additive error even with (ε,δ)-DP and allowing for multiplicative error. Prior to our work, the best additive error bounds for approximate all-pairs Min-s-t-Cuts were O(n3/2/ε)for ε-DP [47] and O( mn/ε)for (ε,δ)-DP [66], both achieved by DP algorithms that preserve all cuts in the graph. To achieve our result, we develop an ε-DP algorithm for the Minimum Isolating Cuts problem with near-linear error, and introduce a novel privacy composition technique combining elements of both parallel and basic composition to handle'bounded overlap' computational branches in recursive algorithms, which maybe of independent interest.


Streaming Algorithms and Lower Bounds for Estimating Correlation Clustering Cost

Neural Information Processing Systems

Correlation clustering is a fundamental optimization problem at the intersection of machine learning and theoretical computer science. Motivated by applications to big data processing, recent years have witnessed a flurry of results on this problem in the streaming model. In this model, the algorithm needs to process the input n-vertex graph by making one or few passes over the stream of its edges and using a limited memory, much smaller than the input size. All previous work on streaming correlation clustering has focused on semistreaming algorithms with Ω(n) memory, whereas in this work, we study streaming algorithms with much smaller memory requirements of only polylog(n) bits. This stringent memory requirement is in the same spirit of classical streaming algorithms that instead of recovering a full solution to the problem--which can be prohibitively large with such small memory as is the case in our problem--, aimed to learn certain statistical properties of their inputs.


k-Median Clustering via Metric Embedding: Towards Better Initialization with Differential Privacy

Neural Information Processing Systems

We propose a new initialization scheme for the k-median problem in the general metric space (e.g., discrete space induced by graphs), based on the construction of metric embedding tree structure of the data. We propose a novel and efficient search algorithm which finds initial centers that can be used subsequently for the local search algorithm. The so-called HST initialization method can produce initial centers achieving lower error than those from another popular method k-median++, also with higher efficiency when k is not too small. Our HST initialization are then extended to the setting of differential privacy (DP) to generate private initial centers. We show that the error of applying DP local search followed by our private HST initialization improves prior results on the approximation error, and approaches the lower bound within a small factor. Experiments demonstrate the effectiveness of our proposed methods.